ZooKeeper + Hadoop + HBase High-Availability Cluster Setup


The three components can be treated separately: ZooKeeper HA cluster setup, Hadoop HA cluster deployment, and HBase HA cluster deployment.

ZooKeeper deployment follows the ZooKeeper HA cluster setup guide and is straightforward.

Prerequisites for Hadoop + HBase:

1. Passwordless SSH is configured between the three hosts (best to ssh to each one at least once, including from each host to itself, because the first connection asks you to confirm the key with "yes"; a quick check sketch follows this list).

Test hosts:
test-39   active NameNode      HMaster (primary)
test-40   NameNode (standby)   HMaster (backup)
test-41   DataNode             HMaster (backup)

2. Local hosts resolution: vim /etc/hosts
10.10.10.39 test-39
10.10.10.40 test-40
10.10.10.41 test-41

3. Java environment variables:
export JAVA_HOME=/usr/local/jdk1.8.0_121
export CLASSPATH=.:${JAVA_HOME}/lib/dt.jar:${JAVA_HOME}/lib/tools.jar
export PATH=${JAVA_HOME}/bin:$PATH
source /etc/profile
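A quick way to confirm prerequisites 1 and 2 before moving on (a minimal sketch using the three test hostnames above):

# Minimal sketch: confirm hosts resolution and passwordless SSH from this node to all three hosts.
for h in test-39 test-40 test-41; do
    # Hostname resolution via /etc/hosts
    ping -c 1 "$h" > /dev/null 2>&1 && echo "$h resolves" || echo "$h does NOT resolve"
    # BatchMode=yes makes ssh fail instead of prompting if key-based login is not set up
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" hostname > /dev/null 2>&1 \
        && echo "passwordless ssh to $h OK" \
        || echo "passwordless ssh to $h FAILED"
done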

With these prerequisites in place, start deploying Hadoop. Perform the following on all three hosts:

1. Create the directory (mkdir /data/hadoop) and extract the package into it: tar -xvf hadoop-2.7.3.tar.gz

2. Edit the environment variables: vim /etc/profile
export HADOOP_HOME=/data/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
source /etc/profile
hadoop version   (verify)

3. Edit the configuration files (located in /data/hadoop/etc/hadoop). Each name = value pair listed below goes into a <property> element of the corresponding XML file (see the fragment after this list).

1) hadoop-env.sh
export JAVA_HOME=/usr/local/jdk1.8.0_121
export HADOOP_PID_DIR=/data/hadoop/tmp

2) hdfs-site.xml (first create the data directory: mkdir /data/hadoop/dfsdata)
dfs.nameservices = mycluster
dfs.ha.namenodes.mycluster = nn1,nn2
dfs.namenode.rpc-address.mycluster.nn1 = test-39:9000
dfs.namenode.rpc-address.mycluster.nn2 = test-40:9000
dfs.namenode.http-address.mycluster.nn1 = test-39:50070
dfs.namenode.http-address.mycluster.nn2 = test-40:50070
dfs.ha.automatic-failover.enabled.mycluster = true
dfs.namenode.shared.edits.dir = qjournal://test-39:8485;test-40:8485;test-41:8485/mycluster
dfs.namenode.edits.journal-plugin.qjournal = org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager
dfs.ha.fencing.methods = sshfence and shell(/bin/true)   (two methods, one per line in the value)
dfs.ha.fencing.ssh.private-key-files = /root/.ssh/id_rsa
dfs.journalnode.edits.dir = /data/hadoop/ha/jn
dfs.permissions.enabled = false
dfs.client.failover.proxy.provider.mycluster = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
dfs.ha.automatic-failover.enabled = true
dfs.namenode.name.dir.restore = true
dfs.namenode.name.dir = file:///data/hadoop/dfsdata/name
dfs.blocksize = 67108864
dfs.datanode.data.dir = file:///data/hadoop/dfsdata/data
dfs.replication = 3
dfs.webhdfs.enabled = true

3) mapred-site.xml
mapreduce.framework.name = yarn

4) slaves
test-39
test-40
test-41

5) yarn-site.xml
yarn.resourcemanager.connect.retry-interval.ms = 2000
yarn.resourcemanager.ha.enabled = true
yarn.resourcemanager.ha.rm-ids = rm1,rm2
yarn.resourcemanager.hostname.rm1 = test-39
yarn.resourcemanager.hostname.rm2 = test-40
yarn.resourcemanager.ha.automatic-failover.enabled = true
yarn.resourcemanager.ha.id = rm1
yarn.resourcemanager.recovery.enabled = true
yarn.resourcemanager.store.class = org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore
yarn.resourcemanager.zk-address = test-39:2181,test-40:2181,test-41:2181
yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms = 5000
yarn.resourcemanager.cluster-id = mycluster
yarn.resourcemanager.address.rm1 = test-39:8132
yarn.resourcemanager.scheduler.address.rm1 = test-39:8130
yarn.resourcemanager.webapp.address.rm1 = test-39:8088
yarn.resourcemanager.resource-tracker.address.rm1 = test-39:8131
yarn.resourcemanager.admin.address.rm1 = test-39:8033
yarn.resourcemanager.ha.admin.address.rm1 = test-39:23142
yarn.resourcemanager.address.rm2 = test-40:8132
yarn.resourcemanager.scheduler.address.rm2 = test-40:8130
yarn.resourcemanager.webapp.address.rm2 = test-40:8088
yarn.resourcemanager.resource-tracker.address.rm2 = test-40:8131
yarn.resourcemanager.admin.address.rm2 = test-40:8033
yarn.resourcemanager.ha.admin.address.rm2 = test-40:23142
yarn.nodemanager.aux-services = mapreduce_shuffle
yarn.nodemanager.aux-services.mapreduce.shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
yarn.nodemanager.local-dirs = /data/hadoop/dfsdata/yarn/local
yarn.nodemanager.log-dirs = /data/hadoop/dfsdata/logs
yarn.nodemanager.resource.memory-mb = 1024        (memory available per node, in MB)
yarn.scheduler.minimum-allocation-mb = 258        (minimum memory a single container can request; default 1024 MB)
yarn.scheduler.maximum-allocation-mb = 512        (maximum memory a single container can request; default 8192 MB)
yarn.nodemanager.webapp.address = 0.0.0.0:8042

6) core-site.xml (first create the tmp directory: mkdir /data/hadoop/tmp)
fs.defaultFS = hdfs://mycluster
hadoop.tmp.dir = /data/hadoop/tmp
io.file.buffer.size = 4096
hadoop.proxyuser.hduser.hosts = *
hadoop.proxyuser.hduser.groups = *
ha.zookeeper.quorum = test-39:2181,test-40:2181,test-41:2181
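For reference, each name = value pair above becomes one <property> element in the corresponding XML file; the first two hdfs-site.xml entries, for example, look like this (fragment only):

<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <!-- the remaining properties follow the same pattern -->
</configuration>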

The configuration is basically the same on all three hosts; the only difference is the yarn.resourcemanager.ha.id value in yarn-site.xml:

test-39: rm1, test-40: rm2, test-41: comment the property out. Once one host is configured, you can scp the configuration to the other two hosts (remember to adjust yarn-site.xml; a sketch of that fix-up follows).
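One way to do the fix-up after the scp (a sketch; it assumes the string <value>rm1</value> occurs only in the yarn.resourcemanager.ha.id property, which holds for the values listed above):

# On test-40: change the ResourceManager HA id copied from test-39.
sed -i 's|<value>rm1</value>|<value>rm2</value>|' /data/hadoop/etc/hadoop/yarn-site.xml
# On test-41: comment out (or delete) the whole yarn.resourcemanager.ha.id <property> block by hand.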

Start-up:

1) Start the JournalNode on each of the three hosts (it synchronizes edit-log information between the NameNodes):
sh ./sbin/hadoop-daemon.sh start journalnode
[root@test-39 hadoop]# jps
74371 JournalNode

2) On test-39 (the active node) format the namespace:
hadoop namenode -format

3) Format the ZKFC: this registers the zkfc with ZooKeeper and creates /hadoop-ha on the ZooKeeper cluster:
./bin/hdfs zkfc -formatZK

4) Start everything from the primary node test-39:
start-dfs.sh
start-yarn.sh
[root@test-39 hadoop]# jps
74371 JournalNode
74563 DFSZKFailoverController
76146 NameNode
74996 ResourceManager
75116 NodeManager
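To run the same jps check on all three hosts at once, a small loop is enough (a sketch; it assumes passwordless SSH and that jps is on the remote PATH):

# Sketch: list the Java daemons on every node and compare with the expected processes below.
for h in test-39 test-40 test-41; do
    echo "== $h =="
    ssh "$h" jps | sort -k2
done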

After start-up, check with jps:
All three hosts should have NodeManager (provides and isolates resources).
39 and 40 should have DFSZKFailoverController (monitoring and active/standby failover).
39 and 40 should have NameNode (arbiter and manager of the metadata).
39 and 40 should have ResourceManager (the scheduler, responsible for allocating resources).
If any of these processes is missing, start it manually:

./sbin/yarn-daemon.sh start resourcemanager
./sbin/yarn-daemon.sh start nodemanager
./sbin/hadoop-daemon.sh start namenode   (can be started directly on the active node)

If the NameNode on the standby node test-40 does not start, its metadata has to be copied over before it can start.

First copy the metadata: scp -r dfsdata/* test-40:/data/hadoop/dfsdata/
(This is the directory configured in hdfs-site.xml: dfs.namenode.name.dir = file:///data/hadoop/dfsdata/name. Keep in mind that the directory structure must be identical on both NameNode hosts.)
Then start it:
./sbin/hadoop-daemon.sh start namenode
Verify:
[root@test-39 hadoop]# hdfs haadmin -getServiceState nn1
active
[root@test-39 hadoop]# hdfs haadmin -getServiceState nn2
standby
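For reference, a built-in alternative to copying dfsdata by hand is the bootstrap command; on test-40 that would look roughly like this (a sketch, not the procedure used above):

# On the standby (test-40): pull the namespace from the active NameNode instead of scp-ing dfsdata.
./bin/hdfs namenode -bootstrapStandby
# Then start the NameNode and re-check the HA states as above:
./sbin/hadoop-daemon.sh start namenode
hdfs haadmin -getServiceState nn1    # expected: active
hdfs haadmin -getServiceState nn2    # expected: standby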

Then start the DataNode on each of the three hosts (this step is simple):
./sbin/hadoop-daemon.sh start datanode
Open http://10.10.10.39:50070 and you should see the nodes join successfully.
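Besides the web UI, DataNode registration can also be confirmed from the command line (a sketch):

# Sketch: the report should show three live DataNodes once all nodes have joined.
hdfs dfsadmin -report | grep -E 'Live datanodes|^Name:'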

Hadoop deployment summary:

1. I had copied the Hadoop package from a development environment, so /data/hadoop/dfsdata/ already contained data. The primary node had been formatted, so it was fine, but the other two nodes threw all kinds of errors on start-up. The fix, on each affected node:
Stop it first: ./sbin/hadoop-daemon.sh stop datanode
Clear the stale data: rm -rf dfsdata/*
Start it again: ./sbin/hadoop-daemon.sh start datanode   (the data storage daemon)

HBase cluster deployment

Extract the package: tar -xvf hbase-1.2.3-bin.tar.gz, then cd /data/hbase

Edit the configuration (identical on all three hosts):

【1】vim conf/hbase-env.sh

export JAVA_HOME=/usr/local/jdk1.8.0_121
export HBASE_CLASSPATH=/data/hbase/conf
export HBASE_MANAGES_ZK=false
(With JDK 8, also comment out the PermSize options that hbase-env.sh marks as JDK 7-only.)

【2】bin/hbase-config.sh: set the Java path

export JAVA_HOME=/usr/local/jdk1.8.0_121

【3】conf/regionservers: list the region server hosts by hostname

cat conf/regionservers
test-39
test-40
test-41

【4】conf/hbase-site.xml: connects HBase to Hadoop and ZooKeeper (a sketch of the file follows below)
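The exact file contents are not shown above; a minimal hbase-site.xml for this layout typically looks like the following (a sketch: the property names are standard HBase settings, and the values assume the hosts, ports and paths used above):

# Sketch: write a minimal hbase-site.xml for this cluster (adjust values to your environment).
cat > /data/hbase/conf/hbase-site.xml <<'EOF'
<configuration>
  <!-- HBase data directory in HDFS; per the summary at the end, point this at the
       NameNode that is currently active (test-39 here). -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://test-39:9000/hbase</value>
  </property>
  <!-- Fully distributed mode -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- External ZooKeeper ensemble (HBASE_MANAGES_ZK=false in hbase-env.sh) -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>test-39,test-40,test-41</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
EOF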

【5】touch conf/backup-masters: adds HMaster high availability

cat conf/backup-masters
test-40
test-41

【6】The Hadoop configuration files also need to be copied into the HBase conf directory:

cd /data/hadoop/etc/hadoop/
cp core-site.xml hdfs-site.xml /data/hbase/conf/          (copy on this host)
scp core-site.xml hdfs-site.xml test-40:/data/hbase/conf/
scp core-site.xml hdfs-site.xml test-41:/data/hbase/conf/
Start on test-39:
cd /data/hbase
sh bin/start-hbase.sh
All three hosts will automatically start HRegionServer and HMaster (every node runs both). Without backup-masters configured, only test-39 would run an HMaster.
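One quick way to confirm the cluster is up, in addition to jps (a sketch; the binary path matches the layout above):

# Sketch: print cluster status (active master, backup masters, region servers) via the HBase shell.
echo "status" | /data/hbase/bin/hbase shell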

Web check: curl 127.0.0.1:16010 shows 3 RegionServers and two backup masters.

HBase deployment summary:

The deployment itself is fairly simple. The one thing to watch is the Hadoop node configured in hbase-site.xml: it must be the currently active NameNode. I got it wrong at first and spent a long time tracking it down. Check which node is active with:
hdfs haadmin -getServiceState nn1   (nn1/nn2 are the NameNode IDs)


